Summary: It is shown that under certain natural conditions the play
of a multi-stage game is to a great extent independent of the payoff
function.


ON MULTI-STAGE GAMES WITH IMPRECISE PAYOFF
Richard Bellman


§1. Introduction

There is a large class of situations of economic and military
significance which can be considered to be multi-stage games.
In many of these situations, the payoff function is easily determined;
in others, it is a matter of difficulty to determine a suitable
criterion.

The purpose of this note is to show, in a heuristic fashion,
that in many cases the optimal play is independent of the precise
form of the payoff, provided only that this payoff possesses certain
intuitive properties.

§2. Description of the Multi-stage Game

Let us consider a zero-sum multi-stage game where each play
is determined by the game matrix A = (a_{ij}). Let initially the
first player possess a quantity x of resources and the second
player a quantity y. Since the game is zero sum, the state of
the game at any time is described by x.

Defining a suitable criterion function, let f(x) represent
the value of the game to the first player.
Then, f(x) satisfies the functional equation


    f(x) = \min_q \max_p \sum_{i,j} p_i q_j f(x + a_{ij})
         = \max_p \min_q \sum_{i,j} p_i q_j f(x + a_{ij})          (2.1)


(see QQ, [^). The quantities p_i and q_j will, in general,
depend upon x.

§3. Assumptions Concerning f(x)

Let us now assume that x and y are large compared to a_{ij}.
In other words, the state of the system is only slightly disturbed
by any one play of the game. Furthermore, let us assume that the
value of the matrix A is zero, which is to say, it is on the average
a fair game. Otherwise, the play is relatively trivial.

Finally, we assume that it pays to start with a larger initial
resource. Then


    f'(x) > 0                                                      (3.1)


§4. Heuristic Conclusion

Under these assumptions, we wish to show plausibly but not
rigorously that the p_i and q_j are approximately independent of
x and f(x). This means that, under these assumptions, on each
play the players attempt to maximize the single-stage return, the
ordinary expected value.

Let us write

    f(x + a_{ij}) \approx f(x) + a_{ij} f'(x)                      (4.1)

Then, from (2.1),

    f(x) = \min_q \max_p \sum_{i,j} p_i q_j [ f(x) + a_{ij} f'(x) ]   (4.2)


or

    f(x) = f(x) \sum_{i,j} p_i q_j
           + \min_q \max_p f'(x) \sum_{i,j} a_{ij} p_i q_j          (4.3)


whence, since \sum_{i,j} p_i q_j = 1 and f'(x) > 0,

    0 = \min_q \max_p \sum_{i,j} a_{ij} p_i q_j
      = \max_p \min_q \sum_{i,j} a_{ij} p_i q_j                     (4.4)
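The conclusion of this section can be checked numerically in a small case. The following is a minimal sketch, not part of the original note: it assumes a hypothetical 2 x 2 fair matrix A with no saddle point and two hypothetical increasing criterion functions (log and square root) standing in for f, and it uses the standard closed-form solution of a 2 x 2 zero-sum game. It compares the optimal mixed strategies of the game with payoffs f(x + a_{ij}) against those of the single-stage game A when x is large compared to the a_{ij}.

```python
import math

def solve_2x2(m):
    """Closed-form solution of a 2x2 zero-sum game WITHOUT a saddle point.

    m[i][j] is the payoff to the row player (the maximizer).
    Returns (p1, q1, value): probability of row 1, probability of
    column 1, and the value of the game.
    """
    (a, b), (c, d) = m
    denom = a + d - b - c
    p1 = (d - c) / denom      # row player equalizes the two column payoffs
    q1 = (d - b) / denom      # column player equalizes the two row payoffs
    value = (a * d - b * c) / denom
    return p1, q1, value

# Hypothetical fair game (value zero, no saddle point); an assumption for illustration.
A = [[2.0, -1.0], [-2.0, 1.0]]

def stage_matrix(f, x):
    """Payoff matrix of one play when outcomes are measured through f."""
    return [[f(x + a_ij) for a_ij in row] for row in A]

p_single, q_single, v_single = solve_2x2(A)

x = 1000.0  # large compared to the entries of A
p_log, q_log, _ = solve_2x2(stage_matrix(math.log, x))
p_sqrt, q_sqrt, _ = solve_2x2(stage_matrix(math.sqrt, x))

print("single-stage:", p_single, q_single, v_single)
print("f = log     :", p_log, q_log)
print("f = sqrt    :", p_sqrt, q_sqrt)
```

Under these assumptions the strategies computed for both choices of f stay close to the single-stage strategies, which is the approximate independence of the payoff asserted above. The closed form is only valid when neither matrix has a saddle point, which holds for this particular A and x.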


§5. Nonzero-sum Games

Let us consider a two-person, multi-stage, nonzero-sum game
where the first and second players have, respectively, the game
matrices

    A = (a_{ij}),  B = (b_{ij})                                    (5.1)


RM-1337


and initially the amounts x and y, respectively.

Let f(x,y) be some criterion function, such as probability
of survival, assumed to satisfy the conditions

    f_x > 0,  f_y < 0                                              (5.2)


and assume that

    \sum_{i,j} a_{ij} p_i q_j,  \sum_{i,j} b_{ij} p_i q_j < 0      (5.3)

a game of attrition.


Then f(x,y) satisfies the functional equation

    f(x,y) = \max_p \min_q [ \sum_{i,j} p_i q_j f(x + a_{ij}, y + b_{ij}) ]
           = \min_q \max_p [ \cdots ],     x, y > 0
           = 1,      x > 0, y < 0
           = 0,      x < 0, y > 0                                  (5.4)
           = 1/2,    x = y = 0  (for the sake of completeness)


Assume as above that x and y are large compared to a_{ij} and b_{ij}.
Then

    f(x + a_{ij}, y + b_{ij}) \approx f(x,y) + a_{ij} f_x + b_{ij} f_y   (5.5)
Equation (5.4) then yields

    0 = \max_p \min_q [ f_x \sum_{i,j} a_{ij} p_i q_j + f_y \sum_{i,j} b_{ij} p_i q_j ]
      = \min_q \max_p [ \cdots ]                                   (5.6)


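or, dividing by f_x \sum_{i,j} b_{ij} p_i q_j, which is negative by
(5.2) and (5.3) and so interchanges the roles of Max and Min,

    - f_y / f_x
      = \min_p \max_q \frac{\sum_{i,j} a_{ij} p_i q_j}{\sum_{i,j} b_{ij} p_i q_j}
      = \max_q \min_p \frac{\sum_{i,j} a_{ij} p_i q_j}{\sum_{i,j} b_{ij} p_i q_j}   (5.7)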


This shows that the single-stage play is approximately governed
by the criterion function

    K(p,q) = - \frac{\sum_{i,j} a_{ij} p_i q_j}{\sum_{i,j} b_{ij} p_i q_j}   (5.8)


That min-max = max-min in (5.7) is a result due to von Neumann.
An elegant short proof based on the usual min-max theorem will
be  found  in  [V)  . 
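The ratio criterion K(p,q) of (5.8) and the equality of its max-min and min-max can be illustrated by brute force. The following is a rough sketch only: the 2 x 2 matrices A and B are hypothetical, chosen with all entries negative so that the attrition condition (5.3) holds for every p and q, and the grid search is a swapped-in numerical device for illustration, not the method of the note or of von Neumann's proof.

```python
def bilinear(m, p1, q1):
    """Expected entry of the 2x2 matrix m under mixed strategies (p1, 1-p1), (q1, 1-q1)."""
    p = (p1, 1.0 - p1)
    q = (q1, 1.0 - q1)
    return sum(p[i] * q[j] * m[i][j] for i in range(2) for j in range(2))

# Hypothetical attrition matrices: every entry is negative, so both
# expected changes in resources are negative for all p and q.
A = [[-1.0, -3.0], [-2.0, -1.0]]
B = [[-2.0, -1.0], [-1.0, -3.0]]

def K(p1, q1):
    """Single-stage ratio criterion, K = -(sum a p q) / (sum b p q)."""
    return -bilinear(A, p1, q1) / bilinear(B, p1, q1)

n = 200
grid = [k / n for k in range(n + 1)]

# table[i][j] = K at the i-th grid value of p1 and the j-th grid value of q1
table = [[K(p1, q1) for q1 in grid] for p1 in grid]

# First player maximizes K over p, second player minimizes over q.
max_min = max(min(row) for row in table)
min_max = min(max(table[i][j] for i in range(n + 1)) for j in range(n + 1))

print("max-min over the grid:", max_min)
print("min-max over the grid:", min_max)
```

On a grid the max-min can never exceed the min-max, and for this example the two agree up to the grid resolution, consistent with the equality stated for (5.7). A finer grid, or a linear-programming treatment of the fractional game, would tighten the agreement.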

We thus have a rationale for the play of large classes of
two-person nonzero-sum games.